Adaptation and excitation effects underlying human auditory sensory memory 
1 Introduction 

The most prominent response in the auditory event-
related field is the supratemporally generated N1m,
which peaks about 100 ms after stimulus onset and
lasts for approximately 100 ms. The N1m is
exceptionally sensitive to stimulus rate, with its
amplitude decreasing sharply as the inter-stimulus
interval (ISI) is decreased.
Previous research [1-3] has suggested that this 
attenuation is due to memory traces left by the 
previous stimuli. Further, as the duration of this 
memory trace coincides with estimates for the 
duration of auditory sensory memory, the N1m could
function as an index for this memory mechanism.
While the rate sensitivity of the N1m amplitude is
well-established, the neural mechanisms underlying 
it remain unknown. Although the notion of 
"refractoriness" is sometimes used [see 1], the ERP 
literature suffers from a distinct opaqueness as to 
what cellular mechanism this might refer to. The 
cortex contains a range of slow inhibitory 
mechanisms [4] which could be considered natural 
candidates for the neural mechanisms of 
refractoriness and hence for N1m attenuation.
These are spike frequency adaptation, related to
non-synaptic after-hyperpolarizing currents lasting
for seconds, and GABA_B-mediated slow synaptic
inhibition, lasting for hundreds of milliseconds.

In the present study, we elaborate further on the
relationship between the N1m and auditory sensory
memory. We modelled the generation of the N1m with
a computational model of cortical dynamics and
tested the predictions of the model with MEG
measurements in humans.

2 Methods 

2.1 Modelling methods 

MEG is mostly due to synaptic currents in parallel
apical dendrites of pyramidal cells oriented
tangentially to the skull surface [5]. The N1m field
pattern is roughly dipolar, implying a localized
activation of a population of pyramidal cells. We
modelled the neural dynamics of this population
through an idealized, single-compartment
description of the pyramidal cell embedded in an
approximation of the canonical cortical microcircuit
[6, 7]. The population received recurrent excitation
(rec), inhibition (inh) with fast and slow decay
times, slow adaptation currents (adap), as well as
excitatory (+) and inhibitory (-) thalamo-cortical
afferent input (aff):

$c_m \, du/dt = \sum_x I_x$, (1)

where u is the membrane potential, $I_x$ are the cross-
membrane currents and $c_m$ is the membrane
capacitance. While the recurrent input was
determined through u, the inhibitory input was 
received from a population of inhibitory cells, 
receiving excitatory input from the pyramidal cells 
as well as feedback inhibition. The leak current was 
determined by $I_{leak} = r_m^{-1} (E_{leak} - u)$, where $E_{leak}$ is
the resting membrane potential. The synaptic (rec,
inh, aff) and adaptation (adap) currents were
determined through the respective channel reversal
potentials $E_x$ and conductances $G_x(t)$:
$I_x(t) = [E_x - u(t)] G_x(t)$. For synaptic currents (rec, aff, and fast
inh) the conductance changes were directly related
to pre-synaptic firing. By assuming a uniform
distribution of firing phases at M synapses, the total
conductance can be approximated by
$G(t) = M g g_{max} f(t)$, where g is a normalizing
constant, $g_{max}$ is the peak conductance change at a
single synapse, and f is the pre-synaptic firing rate.
The latter was related to the membrane potential
through $f(u) = 1[u] f_{max} [1 - \exp(-u/s)]$, where $f_{max}$
is the maximum firing rate, s is a constant, and 1[.]
is the step-function.
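
As an illustration only, the firing-rate relation and the
conductance approximation described above can be sketched in
Python. The exponent is assumed to be negative, so that the rate
saturates at f_max, and u is assumed to be measured relative to
the firing threshold. The function names and all parameter values
(f_max, s, M, g, g_max) are hypothetical placeholders, not values
taken from the study.

```python
import math


def firing_rate(u, f_max=100.0, s=10.0):
    """Saturating rate f(u) = 1[u] * f_max * (1 - exp(-u / s)).

    u is the membrane potential relative to threshold (assumed units: mV).
    f_max and s are illustrative values, not those of the original model.
    """
    if u <= 0.0:  # step function 1[u]: no firing at or below threshold
        return 0.0
    return f_max * (1.0 - math.exp(-u / s))


def total_conductance(f, M=1000, g=1.0, g_max=1e-9):
    """Total conductance G = M * g * g_max * f for M synapses.

    g is the normalizing constant and g_max the single-synapse peak
    conductance change; both values here are placeholders.
    """
    return M * g * g_max * f
```

With these placeholder values, the rate is zero below threshold,
rises steeply just above it, and approaches f_max for large
depolarizations; the conductance simply scales with the rate.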

The adaptation currents depended on the firing rate 
of the pyramidal cells, and the associated 
conductance was described by the dynamical 
equation 

$\tau \, dG_a(t)/dt = -G_a(t) + L f(u(t))$, (2)

where $\tau$ is the decay constant and L is a constant
determining the strength of adaptation. For slow
inhibition, a similar equation was used to describe
the conductance at the associated channel (with u
being replaced by the membrane potential of the
inhibitory cell population).
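
The behaviour of the adaptation conductance in Eq. (2) can be
illustrated with a minimal numerical sketch. It assumes that
Eq. (2) is a first-order equation in which the decay constant
multiplies the time derivative of the conductance, and it uses
forward-Euler integration, which is our choice for illustration
and not necessarily the scheme used in the study. The function
name, the step size and the values of the decay constant and of L
are hypothetical.

```python
def simulate_adaptation(rates, dt=0.001, tau=1.0, L=0.5, g0=0.0):
    """Integrate tau * dG/dt = -G + L * f with the forward-Euler method.

    rates : sequence of firing rates f, one value per time step
    dt    : time step in seconds (illustrative)
    tau   : decay constant in seconds (illustrative)
    L     : adaptation strength (illustrative)
    g0    : initial conductance
    Returns the conductance after each time step.
    """
    g = g0
    trace = []
    for f in rates:
        g += (dt / tau) * (-g + L * f)  # one Euler step of Eq. (2)
        trace.append(g)
    return trace
```

For a constant firing rate f, the conductance builds up towards
L * f with the decay constant as its time scale, and it decays
back towards zero on the same time scale once firing stops. This
slow build-up and decay is what lets the influence of earlier
stimuli persist between stimuli.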

We derived an expression for the MEG in terms of 
the cross-membrane currents of the pyramidal cell 
population as: 
where $M_{pop}$ is the number of cells in the population,
l is the length of the cells, r is the distance between
the population and the sensor, $I_i$ is the sum of
inhibitory currents (inh, adap), and the
function $\Psi$ is determined by the distribution of leak
current along the cell (x). This approximation
assumes that the excitatory input targets one end of 
the cell (the apical dendrites) and inhibitory input 
the other (the soma). 

In simulations, we used 2-sec sequences of 50-ms 
stimuli with different ISIs. Parameter values were 
in the same range as those used in previous 
computational work [e.g., 8, 9]. We measured the
pre-stimulus baseline amplitude ($B_{base}$) for the Nth
stimulus as the value of the MEG at stimulus
presentation, as well as the peak amplitude ($B_{peak}$)
of the response to the Nth stimulus. The baseline-
corrected peak amplitude was defined as
$B_{\Delta p} = B_{peak} - B_{base}$.

2.2 Experimental methods 

Auditory ERFs in 10 human subjects were 
measured for 2-sec binaurally presented stimulus 
sequences containing N tones, separated by 10-sec
silences. The sequences contained 50-msec tones
(including 10-ms rise and fall times, 70 dB SPL)
presented with eight different ISIs ranging from 50
(N = 40) to 1950 (N = 2) msec. In addition, ERFs for
a continuous 2-sec tone were recorded (N = 1). The
subject was instructed to ignore the auditory stimuli 
and to concentrate on reading self-selected text. 
MEG recordings were carried out in DC-mode with 
a 122-channel whole-head magnetometer using a 
sample rate of 400 Hz. Approximately 65 responses 
were averaged over a time period of 3.5 sec for 
each N. The responses were baseline-corrected with 
respect to a pre-sequence baseline of 100 ms and 
low-pass filtered at 30 Hz. Low frequency noise 
was eliminated by measuring the noise in the empty 
chamber and utilizing the signal-space projection 
method [10] to project its contribution out from the 
data. Eye movements were recorded from 
electrodes placed above the left eye and below the
outer canthus of the left eye. Artefacts were defined
as amplitudes exceeding 150 µV.

Amplitude measurements were made for each N, 
each subject, and for both hemispheres. The pre-
stimulus baseline amplitude $B_{base}$ for each tone was
determined by measuring the mean ERF in a 20 ms
time window immediately preceding tone
presentation. The peak amplitude $B_{peak}$ for each
tone was determined as the mean ERF in a 20 ms
time window around the maximum of the response
during a 150-ms post-onset time window. The
baseline-corrected peak amplitude was defined as
$B_{\Delta p} = B_{peak} - B_{base}$.
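
The amplitude measures defined above can be sketched as follows
for a single averaged trace. This is an illustration under stated
assumptions rather than the analysis code of the study: the
function name is hypothetical, the response is assumed to be
positive-going (so that the peak is the maximum), the trace is
assumed to be a plain list of samples at the 400 Hz rate given
above, and the 20-ms peak window is assumed to be centred on the
maximum.

```python
def baseline_corrected_peak(signal, onset, fs=400.0,
                            base_ms=20.0, peak_ms=20.0, search_ms=150.0):
    """Return (B_base, B_peak, B_peak - B_base) for one tone.

    signal : list of ERF samples (one averaged sensor trace)
    onset  : sample index of tone onset
    fs     : sampling rate in Hz
    """
    n_base = int(round(base_ms * fs / 1000.0))      # 8 samples at 400 Hz
    n_peak = int(round(peak_ms * fs / 1000.0))      # 8 samples at 400 Hz
    n_search = int(round(search_ms * fs / 1000.0))  # 60 samples at 400 Hz
    if onset < n_base:
        raise ValueError("not enough samples before onset for the baseline")

    # Mean over the window immediately preceding tone presentation.
    base = signal[onset - n_base:onset]
    b_base = sum(base) / len(base)

    # Locate the maximum within the post-onset search window.
    search = signal[onset:onset + n_search]
    i_max = onset + max(range(len(search)), key=lambda i: search[i])

    # Mean over a window centred on that maximum.
    lo = max(i_max - n_peak // 2, 0)
    hi = min(lo + n_peak, len(signal))
    window = signal[lo:hi]
    b_peak = sum(window) / len(window)

    return b_base, b_peak, b_peak - b_base
```

Returning the baseline and the peak separately, in addition to
their difference, keeps the two contributions visible: the
difference alone cannot show whether a small value reflects an
attenuated peak or a raised baseline.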



Figure 1: Simulation results when N = 40 stimuli 
with an ISI of 50 ms were presented in a 2-sec 
window. Depending on the strength of the 
adaptation and inhibitory currents, the baseline- 
corrected peak amplitude of the response to the 
final stimulus (onset t = 1.95 sec) can be 
diminished for three reasons: In (A), the peak 
amplitude is attenuated. In (B), the peak amplitude 
is unaffected, but the pre-stimulus baseline 
amplitude is raised. In (C), a combination of
attenuated peak amplitude and raised baseline
amplitude contributes to a diminished baseline-
corrected peak amplitude. The total number of cells
was in the $10^4$-$10^5$ range.

3 Results 

3.1 Simulation results 

The simulations demonstrated that an attenuation of
the baseline-corrected peak amplitude for responses
to stimuli presented at rapid rates can occur in three
different scenarios. (1) When parameter values
allow strong adaptation and inhibition to set in
after the presentation of the first stimulus,
the peak amplitude of the responses to subsequent
stimuli is strongly attenuated (Fig. 1A). This
explains most of the attenuation of the baseline-
corrected peak amplitude, as the pre-stimulus
baseline amplitudes are only weakly affected. (2) When
adaptation and inhibition are weakened, the peak
amplitude of the response remains unaffected
across the serial position of the stimulus (Fig. 1B).
In addition, at the onset of each stimulus, excitation
due to previous stimuli has decayed only
minimally. This causes a strongly raised pre-
stimulus baseline amplitude and, consequently, a
diminished baseline-corrected peak amplitude. (3)
When intermediate strengths of adaptation and
inhibition are used, the peak amplitude attenuates
across serial position (Fig. 1C). Also, as
depolarization due to previous stimuli is still
present at stimulus presentation, the pre-stimulus
baseline amplitude is raised for stimuli following
the first. Therefore, the baseline-corrected peak
amplitude is strongly attenuated, as in the previous
two cases.

The ISI-dependence of the baseline amplitude, the
peak amplitude and the baseline-corrected peak
amplitude is demonstrated in Figure 2. Depending
on the strength of inhibition and adaptation, the
peak amplitude could either increase as a function
of ISI (Figs. 2A & E), or remain relatively stable
(Fig. 2C). Similarly, the baseline amplitude could
either remain stable (Fig. 2A) or decrease as a
function of ISI (Figs. 2C & E). However, in all
cases, the baseline-corrected peak amplitude
demonstrated strong stimulus rate dependence,
growing monotonically as a function of ISI (Figs.
2B, D, F). Further, this behaviour could be
characterized by an exponentially saturating
function used in previous studies to describe N1m
rate dependence [2, 3].

Figure 2: Simulation results showing the ISI-
dependence of the pre-stimulus baseline amplitude
($B_{base}$), the peak amplitude ($B_{peak}$) and the baseline-
corrected peak amplitude ($B_{\Delta p}$). Depending on the
adaptation and inhibition, the behaviour of $B_{\Delta p}$ can
be due to $B_{peak}$ increasing (A&B) or $B_{base}$
decreasing (C&D) as a function of ISI, or to a
combination of these (E&F). In all cases $B_{\Delta p}$ is an
increasing function of ISI which can be described
by an exponentially saturating function used in
[2, 3].

Figure 3: Experimental results averaged over 10
human subjects. In (A), magnetic measurements
from the right fronto-temporal sensor displaying
maximal N1m responses. N refers to the number of stimuli
presented in the time window 0 to 2 sec. In (B), the
peak amplitude and the pre-stimulus baseline of the
response to the last stimulus are plotted against the
ISI of the sequence. Fast stimulation rates attenuate
$B_{peak}$ and raise $B_{base}$. In (C), the combined effect of
this is the ISI-dependence of the baseline-corrected
peak amplitude. An exponentially saturating
function used in [2, 3] has been fitted to the data.

3.2 Experimental results 

The tones elicited prominent responses over both
hemispheres, with the N1m fields being maximal
over the fronto-temporal areas. In all conditions,
the first stimulus elicited a large N1m with a
magnitude of around 100 fT/cm. For long ISIs
(>280 ms, N < 8), the N1m responses to the subsequent
stimuli were attenuated to around 50 fT/cm. When
shorter ISIs were used (<280 ms, N > 8), the peak
amplitudes of the responses were again attenuated
by 50%. In addition, the pre-stimulus baselines
were raised, so that as the stimulus rate was
increased, a sustained response became evident.
The responses for the shortest ISIs (100 & 50 ms, N
= 20 & 40) closely resembled the sustained field
elicited by the continuous (N = 1) 2-sec tone. The
measurements therefore revealed that with high
stimulus rates (short ISIs, large Ns) the baseline-
corrected peak amplitude of the N1m is decreased
by the combined effect of: (1) the pre-stimulus
baseline amplitude being increased, and (2) the
peak amplitude being decreased.

4 Discussion 

The MEG recordings support our model, in which the
dependence of the N1m amplitude on stimulus rate is due
to (1) depolarization effects raising the pre-stimulus
baseline and (2) adaptation and inhibition
suppressing the peak amplitude of the N1m. The
combined effect of these is the apparently strong
attenuation of the response when pre-stimulus
baseline correction is used. This correction has
routinely been utilized in previous research [e.g.,
1-3], which indicated that the N1(m) amplitude can
be reduced by up to two orders of magnitude when 
rapid stimulus rates are employed. Our results 
contradict these findings in demonstrating that the 
relative variation in the peak amplitude of the 
auditory ERF across stimulation rates is due to 
amplitude reductions of only 50% (Fig. 3). Also, 
our observations show that the pre-stimulus 
baseline correction method can lead to serious 
under-estimations of cortical activity when fast 
stimulation rates are used. 

Our combined computational and experimental 
results indicate that auditory sensory memory could 
manifest itself both in excitatory activity traces with 
a lifetime in the order of 100 ms (evident in the fast 
decaying sustained responses in Fig. 3A), and in 
inhibitory effects with lifetimes in the order of 1 sec 
(evident in the attenuation of the peak amplitude, 
Fig. 3A & B). These neural mechanisms might 
account for the short and long auditory memory 
stores suggested by behavioural evidence and 
experienced as sensation and memory, respectively 
[11].